Speaker Clustering and Transformation for Speaker Adaptation in Large-Vocabulary

Authors

  • M. Padmanabhan
  • L. R. Bahl
  • D. Nahamoo
  • M. A. Picheny
Abstract

1 ABSTRACT

A speaker adaptation strategy is described that is based on finding a subset of speakers, from the training set, who are acoustically close to the test speaker, and using only the data from these speakers (rather than the complete training corpus) to re-estimate the system parameters. Further, a linear transformation is computed for every one of the selected training speakers to better map the training speaker's data to the test speaker's acoustic space. Finally, the system parameters (Gaussian means) are re-estimated specifically for the test speaker using the transformed data from the selected training speakers. Experiments showed that this scheme is capable of reducing the error rate by 10-15% with the use of as little as 3 sentences of adaptation data.

2 INTRODUCTION

In the last couple of years, several advances have been made in improving the error rate of continuous-speech-recognition systems [1]. For instance, typical word-error rates on test data drawn from the Wall Street Journal database, as reported by different participants in the Wall Street Journal task [1], hover in the neighborhood of 12% for large-vocabulary speaker-independent systems. Though this represents a reasonable level of performance on this particular test data, there is still scope for further improvement.

One way to improve the performance of these systems is to make the system parameters speaker-dependent. However, large-vocabulary systems tend to have a large number of parameters, and a large amount of training data is needed to estimate these parameters robustly. This implies that the test speaker would have to furnish a large amount of data to train the system specifically to his/her speech, which is usually not a practical solution. Consequently, most systems use speaker adaptation techniques that require only a small amount of data from the test speaker. This data is used to move the parameters of the speaker-independent system towards speaker-dependent values.

In this paper, we present a speaker adaptation method that is based on finding a cluster of speakers who are acoustically 'close' to the test speaker, and using these speakers to estimate model parameters that are closer to the test speaker's data than the speaker-independent model parameters. Further, this method assumes that the basic speech recognition system uses HMMs to model the speech production process, and mixtures of continuous-density Gaussian pdfs to model the output distributions of the HMMs.

3 TECHNICAL BACKGROUND

Some adaptation schemes that have been proposed recently …
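
The three steps summarized in the abstract (selecting acoustically close training speakers, mapping each selected speaker's data toward the test speaker, and re-estimating the Gaussian means from the mapped data) can be illustrated with a small sketch. It is not the authors' implementation, and every name in it is hypothetical. It rests on simplifying assumptions that are not taken from the paper: closeness is measured as the Euclidean distance between mean feature vectors rather than with the HMM-based system, the linear transformation is restricted to a per-dimension scale and offset fitted by matching means and standard deviations, and each training frame is assumed to carry a class label already (for example from an alignment).

```python
from math import dist
from statistics import fmean, pstdev


def column_stats(frames):
    """Per-dimension mean and standard deviation of a list of feature vectors."""
    cols = list(zip(*frames))
    return [fmean(c) for c in cols], [pstdev(c) for c in cols]


def select_speakers(train, adapt_frames, n_select):
    """Pick the n_select training speakers whose mean vector is nearest the test speaker's.

    Assumption: Euclidean distance between means stands in for acoustic closeness.
    """
    test_mean, _ = column_stats(adapt_frames)

    def distance(spk):
        spk_mean, _ = column_stats([x for _, x in train[spk]])
        return dist(spk_mean, test_mean)

    return sorted(train, key=distance)[:n_select]


def fit_diagonal_affine(src_frames, tgt_frames, eps=1e-8):
    """Fit y = scale * x + offset per dimension by matching mean and std.

    Assumption: a diagonal transform is a restricted stand-in for a full linear one.
    """
    src_mean, src_std = column_stats(src_frames)
    tgt_mean, tgt_std = column_stats(tgt_frames)
    scale = [t / max(s, eps) for s, t in zip(src_std, tgt_std)]
    offset = [tm - a * sm for a, sm, tm in zip(scale, src_mean, tgt_mean)]
    return scale, offset


def adapt_means(train, adapt_frames, n_select):
    """Re-estimate one mean per class from the transformed data of the selected speakers."""
    selected = select_speakers(train, adapt_frames, n_select)
    pooled = {}
    for spk in selected:
        frames = [x for _, x in train[spk]]
        scale, offset = fit_diagonal_affine(frames, adapt_frames)
        for label, x in train[spk]:
            mapped = [a * v + b for a, v, b in zip(scale, x, offset)]
            pooled.setdefault(label, []).append(mapped)
    means = {label: column_stats(xs)[0] for label, xs in pooled.items()}
    return selected, means


# Toy data: speaker -> list of (class label, 2-dimensional feature vector).
train = {
    "spk_a": [("s1", [0.0, 0.0]), ("s1", [2.0, 2.0]), ("s2", [4.0, 6.0]), ("s2", [6.0, 8.0])],
    "spk_b": [("s1", [1.0, 1.0]), ("s1", [3.0, 3.0]), ("s2", [5.0, 7.0]), ("s2", [7.0, 9.0])],
    "spk_far": [("s1", [50.0, 50.0]), ("s2", [60.0, 70.0])],
}
# Unlabeled adaptation frames from the test speaker.
adapt_frames = [[0.5, 0.5], [2.5, 2.5], [4.5, 6.5], [6.5, 8.5]]

selected, means = adapt_means(train, adapt_frames, n_select=2)
```

On this toy data the distant speaker `spk_far` is left out, and the class means are computed only from the two nearby speakers after their frames have been shifted into the test speaker's range. A real system would replace each stand-in with the corresponding model-based quantity.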

Similar resources

Studies in transformation-based adaptation

This paper studies the use of transformation-based speaker adaptation in improving the performance of large vocabulary continuous speech recognition systems. We present a formulation of the adaptation procedure that is simpler than existing methods. Our experiments demonstrate that speaker normalization continues to be important even after significant amounts of speaker adaptation. An automatic ...

Online Bayesian tree-structured transformation of HMMs with optimal model selection for speaker adaptation

This paper presents a new recursive Bayesian learning approach for transformation parameter estimation in speaker adaptation. Our goal is to incrementally transform or adapt a set of hidden Markov model (HMM) parameters for a new speaker and gain large performance improvement from a small amount of adaptation data. By constructing a clustering tree of HMM Gaussian mixture components, the linear...

Transformation and Combination of Hidden Markov Models for Speaker Selection Training

This paper presents a 3-stage adaptation framework based on speaker selection training. First, a subset of cohort speakers is selected for the test speaker using a Gaussian mixture model, which is more reliable given very limited adaptation data. Then the cohort models are linearly transformed closer to each test speaker. Finally, the adapted model for the test speaker is obtained by combining these transf...

Speaker clustering and transformation for speaker adaptation in speech recognition systems

A speaker adaptation strategy is described that is based on finding a subset of speakers, from the training set, who are acoustically close to the test speaker, and using only the data from these speakers (rather than the complete training corpus) to reestimate the system parameters. Further, a linear transformation is computed for every one of the selected training speakers to better map the t...

On-line Bayesian Tree-structured Transformation of Hidden Markov Models for Speaker Adaptation

This paper presents a new recursive Bayesian learning approach for transformation parameter estimation in speaker adaptation. Our goal is to incrementally transform (or adapt) the entire set of HMM parameters for a new speaker or new acoustic environment from a small amount of adaptation data. By establishing a clustering tree of HMM Gaussian mixture components, the finest affine transformation par...

Publication date: 1995